BOPS, Not FLOPS! A New Metric, Measuring Tool, and Roofline Performance Model For Datacenter Computing

Authors

  • Lei Wang
  • Jianfeng Zhan
  • Wanling Gao
  • Rui Ren
  • Xiwen He
  • Chunjie Luo
  • Gang Lu
  • Jingwei Li
Abstract

Over the past decades, FLOPS (Floating-point Operations per Second), an important computation-centric performance metric, has guided computer architecture evolution, bridged hardware and software co-design, and provided quantitative performance numbers for system optimization. However, for emerging datacenter computing (DC for short) workloads, such as internet services or big data analytics, previous work reports that on modern CPU architectures floating-point instructions account for only 1% of instructions on average and the average FLOPS efficiency is only 0.1%, while the average CPU utilization is as high as 63%. These contradictory performance numbers imply that FLOPS is inappropriate for evaluating DC computer systems. To address this issue, we propose a new computation-centric metric, BOPS (Basic OPerations per Second). In our definition, Basic Operations include all arithmetic, logical, comparison, and array addressing operations for integer and floating point. BOPS is the average number of BOPs (Basic OPerations) completed each second. To that end, we present a dwarf-based measuring tool to evaluate DC computer systems in terms of the new metric. On the basis of BOPS, we also propose a new roofline performance model for DC computing. Through experiments, we demonstrate that the new metric (BOPS), the measuring tool, and the new performance model indeed facilitate DC computer system design and optimization.
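The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' measuring tool. It assumes the BOP count of a run is already known, and it assumes the BOPS-based roofline takes the same form as the classic roofline model (attainable rate = min(peak rate, memory bandwidth × operational intensity)) with BOPs in place of floating-point operations; the abstract does not give the paper's exact formula. All function and parameter names are hypothetical.

```python
def bops(basic_ops: float, seconds: float) -> float:
    """BOPS as defined in the abstract: basic operations completed per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return basic_ops / seconds


def roofline_bound(peak_bops: float, mem_bandwidth: float, intensity: float) -> float:
    """Upper bound on attainable BOPS.

    Assumption: same shape as the classic roofline model, with
    intensity measured in BOPs per byte of memory traffic and
    mem_bandwidth in bytes per second.
    """
    return min(peak_bops, mem_bandwidth * intensity)


def efficiency(attained_bops: float, peak_bops: float) -> float:
    """Fraction of the peak rate that a workload actually reaches."""
    return attained_bops / peak_bops


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    attained = bops(basic_ops=2_000_000, seconds=0.5)
    bound = roofline_bound(peak_bops=100e9, mem_bandwidth=10e9, intensity=2.0)
    print(f"attained: {attained:.3e} BOPS, roofline bound: {bound:.3e} BOPS")
```

With a low operational intensity the bound is set by memory bandwidth; once the intensity is high enough, the bound flattens at the peak rate.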

Similar resources

Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor

The Roofline Performance Model is a visually intuitive method used to bound the sustained peak floating-point performance of any given arithmetic kernel on any given processor architecture. In the Roofline, performance is nominally measured in floating-point operations per second as a function of arithmetic intensity (operations per byte of data). In this study we determine the Roofline for the...

MEASURING SOFTWARE PROCESSES PERFORMANCE BASED ON FUZZY MULTI AGENT MEASUREMENTS

The present article discusses and presents a new and comprehensive approach aimed at measuring the maturity and quality of software processes. This method has been designed on the basis of the Software Capability Maturity Model (SW-CMM) and the Multi-level Fuzzy Inference Model and is used as a measurement and analysis tool. Among the most important characteristics of this method one can mention si...

Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels

Achieving optimal program performance requires deep insight into the interaction between hardware and software. For software developers without an indepth background in computer architecture, understanding and fully utilizing modern architectures is close to impossible. Analytic loop performance modeling is a useful way to understand the relevant bottlenecks of code execution based on simple ma...

Definition of General Operator Space and The s-gap Metric for Measuring Robust Stability of Control Systems with Nonlinear Dynamics

In recent decades, metrics have been introduced as mathematical tools to determine the robust stability of closed-loop control systems. However, their drawback is their limited applicability to closed-loop control systems with nonlinear dynamics. As a solution, the literature suggests applying the metric theories to the linearized models. In this paper, we show that usin...

Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis

We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated architectures. This paper focuses on the processor architecture characterization engine, a collection of portable instrumented micro benchmarks implemented with Message Passing Interface (MPI), and OpenMP used to express thread-level parallelism. These benchmarks are specialized to quantify the behav...

Journal title:
  • CoRR

Volume abs/1801.09212  Issue 

Pages  -

Publication date 2018